 handwritten note


NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding

arXiv.org Artificial Intelligence

Understanding and reasoning over academic handwritten notes remains a challenge in document AI, particularly for mathematical equations, diagrams, and scientific notations. Existing visual question answering (VQA) benchmarks focus on printed or structured handwritten text, limiting generalization to real-world note-taking. To address this, we introduce NoTeS-Bank, an evaluation benchmark for Neural Transcription and Search in note-based question answering. NoTeS-Bank comprises complex notes across multiple domains, requiring models to process unstructured and multimodal content. The benchmark defines two tasks: (1) Evidence-Based VQA, where models retrieve localized answers with bounding-box evidence, and (2) Open-Domain VQA, where models classify the domain before retrieving relevant documents and answers. Unlike classical Document VQA datasets relying on optical character recognition (OCR) and structured data, NoTeS-Bank demands vision-language fusion, retrieval, and multimodal reasoning. We benchmark state-of-the-art Vision-Language Models (VLMs) and retrieval frameworks, exposing limitations in structured transcription and reasoning. NoTeS-Bank provides a rigorous evaluation with NDCG@5, MRR, Recall@K, IoU, and ANLS, establishing a new standard for visual document understanding and reasoning.


InkFM: A Foundational Model for Full-Page Online Handwritten Note Understanding

arXiv.org Artificial Intelligence

Tablets and styluses are increasingly popular for taking notes. To optimize this experience and ensure a smooth and efficient workflow, it's important to develop methods for accurately interpreting and understanding the content of handwritten digital notes. We introduce a foundational model called InkFM for analyzing full pages of handwritten content. Trained on a diverse mixture of tasks, this model offers a unique combination of capabilities: recognizing text in 28 different scripts, recognizing mathematical expressions, and segmenting pages into distinct elements like text and drawings. Our results demonstrate that these tasks can be effectively unified within a single model, achieving state-of-the-art out-of-the-box text line segmentation quality that surpasses public baselines like docTR. Fine- or LoRA-tuning our base model on public datasets further improves the quality of page segmentation and achieves state-of-the-art text recognition (DeepWriting, CASIA, SCUT, and Mathwriting datasets) and sketch classification (QuickDraw). This adaptability of InkFM provides a powerful starting point for developing applications with handwritten input.


The best E Ink tablets for 2024

Engadget

E Ink tablets have always been intriguing to me because I'm a longtime lover of pen and paper. I've had probably hundreds of notebooks over the years, serving as repositories for my story ideas, to-do lists, meeting notes and everything in between. However, I turned away from physical notebooks at a certain point because it was just easier to store everything digitally so I always had my most important information at my fingertips. E Ink tablets seem to provide the best of both worlds: the tactile satisfaction of regular notebooks with many of the conveniences found in digital tools, plus easy-on-the-eyes E Ink screens. These devices have come a long way in the past few years, and we're just starting to see color E Ink tablets become more widely available. I tested out a number of different E Ink tablets to see how well they work, how convenient they really are and which are the best tablets using E Ink screens available today. An E Ink tablet will be a worthwhile purchase to a very select group of people. If you prefer the look and feel of an e-paper display to LCD panels found on traditional tablets, it makes a lot of sense. They're also good options for those who want a more paper-like writing experience (although you can get that kind of functionality on a regular tablet with the right screen protector) or a more distraction-free device overall. The final note is key here.


InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write

arXiv.org Artificial Intelligence

Digital note-taking is gaining popularity, offering a durable, editable, and easily indexable way of storing notes in vectorized form, known as digital ink. However, a substantial gap remains between this way of note-taking and traditional pen-and-paper note-taking, a practice still favored by a vast majority. Our work, InkSight, aims to bridge the gap by empowering physical note-takers to effortlessly convert their work (offline handwriting) to digital ink (online handwriting), a process we refer to as Derendering. Prior research on the topic has focused on the geometric properties of images, resulting in limited generalization beyond their training domains. Our approach combines reading and writing priors, making it possible to train a model in the absence of large amounts of paired samples, which are difficult to obtain. To our knowledge, this is the first work that effectively derenders handwritten text in arbitrary photos with diverse visual characteristics and backgrounds. Furthermore, it generalizes beyond its training domain into simple sketches. Our human evaluation reveals that 87% of the samples produced by our model on the challenging HierText dataset are considered a valid tracing of the input image and 67% look like a pen trajectory traced by a human.


Russian soldier seen surrendering to Ukrainian drone speaks out for first time

FOX News

A Russian soldier was seen surrendering to a Ukrainian drone May 9 in edited video released by Ukraine's 92nd Mechanized Brigade. A Russian soldier whose surrender to Ukrainian forces was captured on a drone camera spoke for the first time about his experience. Ruslan Anitin, a draftee who was cornered alone by the Ukrainian military near the city of Bakhmut, surrendered by communicating via an aerial drone's camera. "It felt like it was never going to involve us at all," Anitin said of the conflict during an interview with the Wall Street Journal about his experience.


OpenAI's state-of-the-art machine vision AI is fooled by handwritten notes

#artificialintelligence

Researchers from machine learning lab OpenAI have discovered that their state-of-the-art computer vision system can be deceived by tools no more sophisticated than a pen and a pad. As illustrated in the image above, simply writing down the name of an object and sticking it on another can be enough to trick the software into misidentifying what it sees. "We refer to these attacks as typographic attacks," write OpenAI's researchers in a blog post. "By exploiting the model's ability to read text robustly, we find that even photographs of hand-written text can often fool the model." They note that such attacks are similar to "adversarial images" that can fool commercial machine vision systems, but far simpler to produce.


The Best AI Trend Is Yet To Come

#artificialintelligence

AI has made incredible progress over the last decade, and better tools and models are being developed every day. From GPU acceleration to progress in natural language processing, we have seen accelerators and enablers take shape and attract huge amounts of investment in the recent past. DeepMind showed us again just this week that things thought to be impossible for another decade can become a reality in no time. Ranging from Smart Robots to Neuromorphic Hardware, we will have a look at the top 13 AI trends that will be on everyone's mind from now until 2025. I am in no way affiliated with any of the following companies.


An image classifier with Deep-Learning

#artificialintelligence

Getting young children to tidy up their rooms is often challenging. What I insist is messy, they will insist is clean enough. After all, all adjectives are subjective and I want my children to grow up respecting others' opinions in our inclusive society. How do you put some definition around differences in opinion? An objective way to draw this distinction is to use image classification to differentiate between a clean and a messy room.


How Artificial Intelligence is Revolutionizing Personalized Medicine - Mellanox Technologies Blog

#artificialintelligence

Imagine becoming gravely ill and yet being able to receive an accurate diagnosis with a recommended treatment plan in just 10 minutes. This is actually happening now with the help of Artificial Intelligence (AI). The University of Tokyo recently reported that Watson, IBM's cognitive supercomputer, correctly diagnosed a rare form of leukemia in a 60-year-old woman. Doctors originally thought the woman had acute myeloid leukemia, but after examining 20 million cancer research papers in 10 minutes, Watson was able to correctly determine the actual disease and recommend a personalized treatment plan. AI – and its related applications, Machine Learning (ML) and Deep Learning (DL) – are changing healthcare as we know it. The advancements made in AI will revolutionize research and, ultimately, personalized medicine.


Samsung and Google built their ideal Chromebook

Engadget

Late last year, I lamented that Google didn't make Chromebooks a priority over the holiday season. With Android apps and the Google Play Store coming to the platform, it seemed like a perfect time to push Chrome OS. As this morning's leak showed, I just needed to wait another month: Samsung and Google have just announced the Chromebook Plus and Chromebook Pro, a pair of laptops that strive to present the best Chrome OS experience a user can have. Let's get the difference between the two models out of the way early: The cheaper Chromebook Plus uses an ARM processor while the Chromebook Pro features an Intel Core M3 processor. Neither is the most powerful out there, but in my quick tests, the Chromebook Pro seemed plenty snappy. I will note that an ARM processor is probably never going to provide the best Chromebook experience one can have, but I'll grudgingly reserve judgement until really testing it out.